An improved DNN-based approach to mispronunciation detection and diagnosis of L2 learners' speech

Authors

  • Wenping Hu
  • Yao Qian
  • Frank K. Soong
Abstract

We extend the Goodness of Pronunciation (GOP) algorithm from the conventional GMM-HMM to DNN-HMM and further optimize the GOP measure for L2 language learners’ accented speech. We evaluate the newly proposed approach on phone-level mispronunciation detection and diagnosis using an L2 English learners’ corpus. Experimental results show that the Equal Error Rate (EER) improves from 32.9% to 27.0% when GOP is extended from GMM-HMM to DNN-HMM, and improves by a further 1.5% absolute, to 25.5%, with our optimized GOP measure. For phone mispronunciation diagnosis, our optimized DNN-based GOP measure reduces the top-1 error rate from 61.0% to 51.4% compared with the original DNN-based one, and the top-5 error rate from 8.4% to 5.2%. On a continuously read L2 Mandarin learners’ corpus, our approaches achieve similar improvements.
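The abstract does not give the GOP formula or the scoring procedure, so the following is only a minimal sketch under stated assumptions. It assumes a simplified GOP variant (the average log posterior of the canonical phone over the frames of its segment, which is not the authors' optimized measure) and shows how an Equal Error Rate can be read off a threshold sweep over such scores. The function names, the toy posteriors, and the toy scores are all hypothetical and are not data from the paper.

```python
import math


def gop_score(frame_posteriors, canonical_phone):
    """Average log posterior of the canonical phone over a segment's frames.

    Assumption: a simplified GOP variant for illustration only, not the
    optimized measure described in the abstract.
    """
    logs = [math.log(max(frame[canonical_phone], 1e-10))
            for frame in frame_posteriors]
    return sum(logs) / len(logs)


def equal_error_rate(scores, is_mispronounced):
    """Sweep thresholds and return (eer, threshold).

    A phone is flagged as mispronounced when its score is below the
    threshold. The EER is taken at the threshold where the false
    acceptance and false rejection rates are closest.
    """
    n_bad = sum(is_mispronounced)
    n_good = len(is_mispronounced) - n_bad
    best = None
    for t in sorted(set(scores)) + [float("inf")]:
        # False rejection: a correct phone flagged as mispronounced.
        fr = sum(1 for s, m in zip(scores, is_mispronounced)
                 if not m and s < t) / n_good
        # False acceptance: a mispronounced phone that is not flagged.
        fa = sum(1 for s, m in zip(scores, is_mispronounced)
                 if m and s >= t) / n_bad
        gap = abs(fa - fr)
        if best is None or gap < best[0]:
            best = (gap, (fa + fr) / 2, t)
    return best[1], best[2]


# Hypothetical toy data: two frames over a two-phone inventory.
toy_gop = gop_score([[0.5, 0.5], [0.5, 0.5]], canonical_phone=0)

# Hypothetical GOP-like scores: four correct phones, four mispronounced.
toy_scores = [-0.1, -0.2, -0.3, -1.5, -0.4, -2.0, -2.5, -3.0]
toy_labels = [False, False, False, False, True, True, True, True]
toy_eer, toy_threshold = equal_error_rate(toy_scores, toy_labels)
```

On real data the two error rates rarely coincide exactly at a single threshold, which is why the sketch reports their mean at the closest point rather than interpolating.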

Similar resources

Improved mispronunciation detection with deep neural network trained acoustic models and transfer learning based logistic regression classifiers

Mispronunciation detection is an important part of a Computer-Aided Language Learning (CALL) system. By automatically pointing out where mispronunciations occur in an utterance, the system gives a language learner informative and to-the-point feedback. In this paper, we improve mispronunciation detection performance with a Deep Neural Network (DNN) trained acoustic model and transfer learning based...

Vowel mispronunciation detection using DNN acoustic models with cross-lingual training

We address the automatic detection of phone-level mispronunciation for feedback in a computer-aided language learning task where the target language data (Indian English) is limited. Based on the recent success of DNN acoustic models on limited resource recognition tasks, we compare different methods of utilizing the limited target language data in the training of acoustic models that are initi...

Automatic generation and pruning of phonetic mispronunciations to support computer-aided pronunciation training

This paper presents a mispronunciation detection system which uses automatic speech recognition to support computer-aided pronunciation training (CAPT). Our methodology extends a model pronunciation lexicon with possible phonetic mispronunciations that may appear in learners’ speech. Generation of these pronunciation variants was previously achieved by means of phone-to-phone mapping rules deriv...

Automatic derivation of phonological rules for mispronunciation detection in a computer-assisted pronunciation training system

Computer-Assisted Pronunciation Training System (CAPT) has become an important learning aid in second language (L2) learning. Our approach to CAPT is based on the use of phonological rules to capture language transfer effects that may cause mispronunciations. This paper presents an approach for automatic derivation of phonological rules from L2 speech. The rules are used to generate an extended...

Integrating acoustic and state-transition models for free phone recognition in L2 English speech using multi-distribution deep neural networks

This paper investigates the use of Multi-Distribution Deep Neural Networks (MD-DNNs) for integrating acoustic and state-transition models in free phone recognition of L2 English speech. In a Computer-Aided Pronunciation Training (CAPT) system, free phone recognition for L2 English speech is the key model for Mispronunciation Detection and Diagnosis (MDD) when free speaking is allowed. A ...


Publication date: 2015